Abstract:
Recent information diffusion models for social networks have not taken the network's topological structure into account. Characteristics such as node degree and clustering are known to influence the speed of information propagation, yet existing models (such as SIR with a single average probability of reposting a received message) are not sophisticated enough to reflect these fine-grained characteristics. Differences among nodes are often overlooked, leading to an inaccurate description of the information dissemination process. In this work, we study a new approach to predicting the information diffusion probability in social networks. We combine Random Forest classification with the SIR model to analyze the dissemination of information on Weibo. Python crawlers are used to collect a total of 316,329 microblogs from Sina Weibo concerning major news events in 2018, together with related node features. The imbalanced positive and negative repost samples, described by 15 features that characterize the nodes and edges, are rebalanced by SMOTE resampling and then used to train a Random Forest classifier that predicts each user's forwarding behavior. Judged by the area under the receiver operating characteristic (ROC) curve (AUC), the Random Forest classifier outperforms a comparable SVM model. Finally, a Susceptible-Infected-Recovered (SIR) information propagation model that takes the forwarding rates obtained from the Random Forest classifier as input parameters is used to simulate the information dissemination process on Weibo. The predicted time evolution of the Susceptible, Infected, and Recovered populations is in good agreement with real-life data obtained from Sina Weibo.
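The last step, feeding classifier-derived forwarding rates into an SIR model, can be illustrated with a minimal sketch. It assumes a discrete-time, mean-field SIR over population fractions, which may differ from the formulation used in the paper. The names `simulate_sir` and `predicted_repost_probs`, the recovery rate `gamma`, and all numeric values are hypothetical placeholders, not data or parameters from the study.

```python
from statistics import mean


def simulate_sir(beta, gamma, i0, steps):
    """Discrete-time mean-field SIR over population fractions (illustrative only).

    beta  -- forwarding rate, e.g. averaged from classifier outputs (assumption)
    gamma -- rate at which spreaders stop forwarding (assumption)
    i0    -- initial fraction of users who have reposted
    """
    if not (0.0 <= beta <= 1.0 and 0.0 <= gamma <= 1.0 and 0.0 <= i0 <= 1.0):
        raise ValueError("beta, gamma and i0 must lie in [0, 1]")
    s, i, r = 1.0 - i0, i0, 0.0
    history = [(s, i, r)]
    for _ in range(steps):
        new_infected = beta * s * i   # susceptible users who repost this step
        new_recovered = gamma * i     # spreaders who lose interest this step
        s = s - new_infected
        i = i + new_infected - new_recovered
        r = r + new_recovered
        history.append((s, i, r))
    return history


# Hypothetical per-user repost probabilities, standing in for the output of a
# trained classifier; these values are made up for illustration.
predicted_repost_probs = [0.05, 0.20, 0.10, 0.40, 0.15]
beta = mean(predicted_repost_probs)

history = simulate_sir(beta=beta, gamma=0.1, i0=0.01, steps=50)
final_s, final_i, final_r = history[-1]
print(f"after 50 steps: S={final_s:.3f}, I={final_i:.3f}, R={final_r:.3f}")
```

Averaging the per-user probabilities into a single `beta` is the simplest possible coupling. A node-level simulation that keeps each user's own probability would preserve more of the heterogeneity the abstract emphasizes.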
Abstract:
This paper proposes a new storage system architecture called Net-RAID, which builds on RAID and establishes a direct network connection to clients. Control and data messages are separated to avoid memory-to-memory data copies between the file server and the storage subsystems. A mass storage system with scalable bandwidth and capacity is constructed from Net-RAIDs. The system bandwidth increases continuously and dynamically as the storage subsystem capacity expands. We built a system prototype and deployed it in the FTP service system of our lab. The test results give useful insights into the system's performance behavior.